Affan Hamid

Probability has always been a mysterious idea in mathematics. Its definition has changed and evolved over the years into a very abstract way of thinking about uncertainty.

In this article, we begin with an introduction to the rigorous background behind probability, built on measure theory. This foundation gives us confidence that probability isn’t something undefined and can be formulated from first principles in mathematics. Today we’ll focus on measuring objects, setting up for the next article where we will talk about probability being a measure on events.

Measure Theory

Measure theory is the field of mathematics that formalizes the idea of measuring things.

In the physical world we often assign real numbers to objects to denote their measure. For example, a laptop may weigh 1 kg (about 2.2 lbs), or a ruler may be 30 cm (about 1 foot) long. These numbers that we’re assigning to objects are measures in real life. Measure theory formalizes this idea and grounds it in set theory so that we can build more complicated measures.

The Foundation: The sigma algebra

Before we begin measuring, we need to figure out which objects are measurable. We begin by discussing two sets: our universe and a sigma algebra.

From now on, we’ll use the analogy of atoms. Think of atoms as individual elements and think of an object as a set of atoms.

We begin by thinking of a set $\psi$ (the Greek letter psi). Think of this as all atoms in the universe. Our objects will be formed from the elements of this set.

\psi=\{o, o, o, o, o\dots\}

Each o represents an atom. Now if we take a subset of this set, then we get an object. This object can be a car, a book, a square, etc.

An object is simply a subset of the universe.

Now think of a collection $\mathbb{G}$. As a reminder, a collection is simply a set of sets. $\mathbb{G}$ is a collection of subsets of $\psi$: the objects we want to be able to measure. The largest possible choice is the collection of all subsets of $\psi$, in other words, all possible objects we can create from the atoms in $\psi$.
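For a small universe, the collection of all subsets can be listed explicitly. The following is a minimal sketch, assuming a hypothetical three-atom universe; the atom names and the helper `power_set` are illustrative and not part of the article’s notation.

```python
from itertools import chain, combinations

def power_set(universe):
    """Return every subset of `universe` as a set of frozensets."""
    atoms = list(universe)
    subsets = chain.from_iterable(
        combinations(atoms, size) for size in range(len(atoms) + 1)
    )
    return {frozenset(subset) for subset in subsets}

# Hypothetical universe of three atoms (names are illustrative only).
psi = frozenset({"a", "b", "c"})
objects = power_set(psi)

print(len(objects))  # 2**3 = 8 subsets, including the empty set and psi itself
```

Each `frozenset` here plays the role of one object, and `objects` plays the role of the largest possible $\mathbb{G}$.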

Now, $\mathbb{G}$ must follow 3 axioms to be called a sigma algebra. A sigma algebra is the name for a collection of sets in measure theory that ensures that whatever measure we’re working with (size, length, mass, etc.) can be properly added, subtracted, and measured without creating inconsistencies.

Three axioms of sigma algebra

Axiom 1:

\emptyset \in \mathbb{G}

We include the idea of nothingness. This is an object with no atoms at all. You can create something like this if you take an object and remove all its atoms. This is analogous to the 0 in real numbers.

Axiom 2:

\forall A \in \mathbb{G}, \; A^c \in \mathbb{G}

This axiom translates to the following sentence: For every set in $\mathbb{G}$, its complement (everything in $\psi$ that is not in the set) must also be in $\mathbb{G}$.

Think of a space with 3 objects: a square, a triangle, and a circle. Suppose we take A to be the square. Then its complement (the triangle and the circle) must be just another object in our space.

Axiom 3:

\forall A_1, A_2, \dots \in \mathbb{G}, \; \bigcup_{i=1}^\infty A_i \in \mathbb{G}

This axiom translates to: For any sequence of sets in $\mathbb{G}$ (possibly infinitely many of them, as long as they can be listed one by one), the union of the sets belongs to $\mathbb{G}$.

Essentially if you combine objects together, then that combination must also be just another object.
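On a finite universe, the three axioms can be checked mechanically. The sketch below is only an illustration under that finiteness assumption: the function name `is_sigma_algebra` and the shape names are hypothetical, and checking unions pair by pair is sufficient only because the collection contains finitely many sets.

```python
from itertools import combinations

def is_sigma_algebra(universe, collection):
    """Check the three axioms for a finite collection of subsets of `universe`."""
    universe = frozenset(universe)
    sets = {frozenset(s) for s in collection}

    # Axiom 1: the empty set belongs to the collection.
    if frozenset() not in sets:
        return False
    # Axiom 2: closed under complements (relative to the universe).
    if any(universe - s not in sets for s in sets):
        return False
    # Axiom 3: closed under unions. With finitely many sets, checking
    # every pair is enough, because larger unions are built pair by pair.
    return all(a | b in sets for a, b in combinations(sets, 2))

psi = {"square", "triangle", "circle"}

good = [set(), {"square"}, {"triangle", "circle"}, psi]
bad = [set(), {"square"}, psi]  # the complement of {"square"} is missing

print(is_sigma_algebra(psi, good))  # True
print(is_sigma_algebra(psi, bad))   # False
```

The `bad` collection fails Axiom 2: it contains the square but not the object made of the triangle and the circle.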

Are these axioms enough?

The only 3 things we’ve defined are the null set, complements, and unions. But how do we know that that’s all we need?

From De Morgan’s law, we can come up with intersection using unions and complements:

A \cap B = (A^c \cup B^c)^c

And now, since we have all the basic operations of set theory, we can pretty much come up with any operation we need in the sigma algebra.
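The De Morgan identity can be spot-checked on concrete sets. This is a small sketch with a made-up six-atom universe; the sets `A` and `B` and the helper `complement` are chosen purely for illustration.

```python
psi = frozenset(range(6))   # hypothetical universe of six atoms
A = frozenset({0, 1, 2})
B = frozenset({2, 3})

def complement(s):
    """Everything in psi that is not in s."""
    return psi - s

# Build the intersection using only complements and a union.
via_de_morgan = complement(complement(A) | complement(B))

print(via_de_morgan == (A & B))  # True
```

A single example is not a proof, but it shows how an intersection is recovered from the two operations the axioms guarantee.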

These axioms define what things we can measure and how they should behave with each other so that our measure actually makes sense.

$(\psi, \mathbb{G})$ together form a “measurable space”. Now let’s get to measuring that space.

Learning to measure

Now that we have a measurable space, how do we actually measure it? We have to define a function.

For example, think of mass as a function. It takes in an object (a set of atoms) and produces a real number. The same is true for length, weight, volume, etc.

So our measure is a function from $\mathbb{G}$ (the collection of measurable objects) to the real numbers. We give it the name “m”.

m:\mathbb{G}\rightarrow \mathbb{R}

It must follow three properties that match our intuitive understanding of measure:

Axiom 1:

\forall A \in \mathbb{G}, m(A) \ge 0

Essentially our measure cannot become negative. There is no negative mass or negative length.

Axiom 2:

m(\emptyset) = 0

If we have nothing to measure, then our function should return 0.

Axiom 3:

\forall A_1, A_2, \dots \in \mathbb{G} \text{ pairwise disjoint}, \quad m\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty m(A_i)

While this may seem slightly complicated, it is very intuitive. Essentially, if we have a collection of objects that do not overlap, then the combined object’s measure should be the sum of the measures of each object.

Here pairwise disjoint means that if you take any two sets/objects, there should be no element/atom shared between them.

Now $(\psi, \mathbb{G}, m)$ is called a measure space.
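To make the three measure axioms concrete, here is a minimal sketch of a mass-like measure on a finite universe. The per-atom masses, the object names `wheel` and `frame`, and the numbers themselves are all made up for illustration; only the idea of summing over atoms comes from the discussion above.

```python
# Hypothetical per-atom masses; the atom names and values are made up.
atom_mass = {"a": 2.0, "b": 0.5, "c": 1.5}

def m(obj):
    """Measure of an object: the sum of the masses of its atoms."""
    return sum(atom_mass[atom] for atom in obj)

wheel = frozenset({"a"})
frame = frozenset({"b", "c"})

# Axiom 1: no object has a negative measure (all atom masses are >= 0).
print(all(mass >= 0 for mass in atom_mass.values()))  # True

# Axiom 2: the empty object measures 0.
print(m(frozenset()))  # 0

# Axiom 3: for disjoint objects, the measure of the union is the sum.
print(wheel.isdisjoint(frame))                   # True
print(m(wheel | frame) == m(wheel) + m(frame))   # True
```

Because the measure is defined atom by atom, additivity over disjoint objects follows automatically: no atom is counted twice.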

What’s Next?

In this article, we’ve laid the foundation for understanding how to measure objects in a space. In the next article, we’ll build on this to explain how probability is just a specific type of measure where the total measure of the space equals 1. This connection will show how measure theory gives probability its mathematical rigor.

Stay tuned!